Speaker Clustering Based on Utterance-Oriented Dirichlet Process Mixture Model
Authors
Abstract
This paper derives an analytical solution and an algorithm for the utterance-oriented Dirichlet process mixture model (UO-DPMM) within a non-parametric Bayesian framework, thereby realizing fully Bayesian speaker clustering. We carried out preliminary speaker clustering experiments on the TIMIT database to compare the proposed method with the conventional method based on the Bayesian Information Criterion (BIC), which is an approximate Bayesian approach. The results showed that the proposed method outperformed the conventional one in terms of both computational cost and robustness to changes in tuning parameters.
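As a rough illustration of why a Dirichlet process prior lets the number of speakers be left open, the sketch below draws cluster labels from a Chinese restaurant process, the standard partition view of a Dirichlet process. It covers the prior only: the paper's utterance-level likelihood and its inference algorithm are not given in the abstract and are not reproduced here. The function name `sample_crp_assignments` and the parameters `num_utterances`, `alpha`, and `seed` are hypothetical names chosen for this example.

```python
import random


def sample_crp_assignments(num_utterances, alpha, seed=0):
    """Draw cluster labels from a Chinese restaurant process prior.

    Assumption: each utterance is treated as one item to be assigned to a
    speaker cluster; alpha > 0 is the concentration parameter.
    """
    rng = random.Random(seed)
    labels = []   # labels[n] = cluster index of utterance n
    counts = []   # counts[k] = number of utterances currently in cluster k
    for _ in range(num_utterances):
        # An existing cluster k is chosen with weight counts[k];
        # a brand-new cluster is opened with weight alpha.
        weights = counts + [alpha]
        k = rng.choices(range(len(weights)), weights=weights)[0]
        if k == len(counts):
            counts.append(1)      # open a new cluster
        else:
            counts[k] += 1        # join an existing cluster
        labels.append(k)
    return labels, counts


if __name__ == "__main__":
    labels, counts = sample_crp_assignments(num_utterances=50, alpha=1.0)
    print("clusters used:", len(counts))
    print("cluster sizes:", counts)
```

The number of clusters is not fixed in advance; it grows with the data, and larger values of `alpha` tend to produce more clusters. In a full mixture model these prior weights would be multiplied by each cluster's likelihood for the utterance before sampling.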
Similar resources
Fully Bayesian speaker clustering based on hierarchically structured utterance-oriented Dirichlet process mixture model
We have proposed a novel speaker clustering method based on a hierarchically structured utterance-oriented Dirichlet process mixture model. In the proposed method, the number of speakers can be determined from the given data in a nonparametric Bayesian manner, and intra-speaker variability is successfully handled by multi-scale mixture modeling. Experimental results showed that the proposed me...
A sampling-based speaker clustering using utterance-oriented Dirichlet process mixture model and its evaluation on large scale data
An infinite mixture model is applied to model-based speaker clustering with sampling-based optimization, which makes it possible to estimate the number of speakers. For this purpose, a non-parametric Bayesian modeling framework is implemented with Markov chain Monte Carlo and incorporated in the utterance-oriented speaker model. The proposed model is called the utterance-oriented Dirichlet pr...
Recognizing the Emotional State Changes in Human Utterance by a Learning Statistical Method based on Gaussian Mixture Model
Speech is one of the most opulent and instant methods to express emotional characteristics of human beings, which conveys the cognitive and semantic concepts among humans. In this study, a statistical-based method for emotional recognition of speech signals is proposed, and a learning approach is introduced, which is based on the statistical model to classify internal feelings of the utterance....
Infinite models for speaker clustering
In this paper we propose the use of infinite models for the clustering of speakers. Speaker segmentation is obtained through a Dirichlet Process Mixture (DPM) model, which can be interpreted as a flexible model with an infinite a priori number of components. Learning is based on a Variational Bayesian approximation of the infinite sequence. The DPM model is compared with fixed prior systems learned b...
Supervised Learning of Acoustic Models in a Zero Resource Setting to Improve DPGMM Clustering
In this work we utilize a supervised acoustic model training pipeline without supervision to improve Dirichlet process Gaussian mixture model (DPGMM) based feature vector clustering. We exploit methods common in supervised acoustic modeling to unsupervisedly learn feature transformations for application to the input data prior to clustering. The idea is to automatically find mappings of feature...